Character Extraction from Interfering Background - Analysis of Double-Sided Handwritten Archival Documents
Abstract
The seeping of ink through the pages of certain double-sided handwritten documents after long periods of storage poses a serious problem for human readers and OCR systems. This paper addresses the problem by recovering the content on the front side of a page from the interfering image caused by the handwriting on the reverse side. First, by adapting the Gaussian stochastic model, an interference model based on norm-orientation discontinuity is proposed to analyze the properties of the interfering strokes. Second, an improved Canny edge detector with an edge norm-orientation similarity constraint is proposed, in which two low thresholds are used to detect edges instead of a single low threshold. This improvement can link weaker foreground edges without introducing noise in the overlapping/overshadowed area. The proposed algorithms perform well regardless of the intensity differences between the image on the front side and the interfering image from the reverse side. Segmentation results on real images are shown and evaluated.
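The abstract does not give the details of the improved detector, so the following is only a minimal sketch of how hysteresis linking with two low thresholds and an orientation similarity check might look. It is not the authors' implementation. The function names `angle_diff` and `link_edges`, the parameters `low_strict`, `low_loose`, and `max_diff`, and the specific acceptance rule are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs((a - b + np.pi) % (2 * np.pi) - np.pi)


def link_edges(mag, theta, high, low_strict, low_loose, max_diff):
    """Hysteresis edge linking with two low thresholds (illustrative only).

    Assumed rule, not taken from the paper:
      * pixels with gradient magnitude >= high are seed edges;
      * a neighbour of an accepted edge pixel is accepted if its magnitude
        is >= low_strict, or if it is >= low_loose AND its gradient
        orientation is within max_diff radians of the accepted pixel's.
    """
    edges = mag >= high
    queue = deque(zip(*np.nonzero(edges)))
    rows, cols = mag.shape
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if dr == 0 and dc == 0:
                    continue
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if edges[nr, nc]:
                    continue
                m = mag[nr, nc]
                similar = angle_diff(theta[nr, nc], theta[r, c]) <= max_diff
                if m >= low_strict or (m >= low_loose and similar):
                    edges[nr, nc] = True
                    queue.append((nr, nc))
    return edges
```

Under this assumed rule, a weak pixel is linked only when its orientation agrees with the edge it extends, which is one plausible way to avoid picking up interfering strokes that cross a foreground stroke at a different angle.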
Similar resources
Segmentation and Analysis of Double-Sided Handwritten Archival Documents
Historical handwritten documents are preserved in good condition in many national archives or libraries. One problem that many archivists are facing is the seeping of ink through the pages of certain double-sided handwritten documents after long periods of storage. This paper addresses this problem and develops a novel algorithm to extract clear textual images from interfering and overlapping a...
Text Extraction from Historical Handwritten Documents by Edge Detection
Many national archives or libraries keep large amounts of historical handwritten documents. One problem that many archivists are facing is the seeping of ink through the pages of certain double-sided handwritten documents after long periods of storage. The result is that the handwritten characters from the reverse side appear as noise on the front side and even interfere with the front side char...
A wavelet approach to double-sided document image pair processing
In this paper, we present a novel method for processing double-sided historic handwritten documents using wavelets. The method is specially designed to remove the interfering strokes from the reverse side due to ink seeping through pages after long periods of storage. The proposed method works by first matching both sides of a document page such that the interfering strokes are mapped with the ...
Restoration of Archival Documents Using a Wavelet Technique
This paper addresses a problem of restoring handwritten archival documents by recovering their contents from the interfering handwriting on the reverse side caused by the seeping of ink. We present a novel method that works by first matching both sides of a document such that the interfering strokes are mapped with the corresponding strokes originating from the reverse side. This facilitates th...
Characters Extraction from Strings on a Document Image Using Handwritten Marks on Touch Screen
We argued validity of Tablet PCs in the fields of E-learning, and this paper discussed a new character extraction system from strings on the document image using handwritten marks on touch screen. As the first step of this study, we proposed the method to identify handwritten notes/marks to associate strings on the documents with handwritten marks. In this paper, experiments using actual scanne...